hierarchical classification problem
Evaluating machine learning models in non-standard settings: An overview and new findings
Hornung, Roman, Nalenz, Malte, Schneider, Lennart, Bender, Andreas, Bothmann, Ludwig, Bischl, Bernd, Augustin, Thomas, Boulesteix, Anne-Laure
Estimating the generalization error (GE) of machine learning models is fundamental, with resampling methods being the most common approach. However, in non-standard settings, particularly those where observations are not independently and identically distributed, resampling using simple random data divisions may lead to biased GE estimates. This paper strives to present well-grounded guidelines for GE estimation in various such non-standard settings: clustered data, spatial data, unequal sampling probabilities, concept drift, and hierarchically structured outcomes. Our overview combines well-established methodologies with other existing methods that, to our knowledge, have not been frequently considered in these particular settings. A unifying principle among these techniques is that the test data used in each iteration of the resampling procedure should reflect the new observations to which the model will be applied, while the training data should be representative of the entire data set used to obtain the final model. Beyond providing an overview, we address literature gaps by conducting simulation studies. These studies assess the necessity of using GE-estimation methods tailored to the respective setting. Our findings corroborate the concern that standard resampling methods often yield biased GE estimates in non-standard settings, underscoring the importance of tailored GE estimation.
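The paper's unifying principle (each resampling test fold should reflect the new observations the model will be applied to) can be illustrated for the clustered-data setting with a leave-one-cluster-out split. This is a minimal pure-Python sketch of that idea; the function and variable names below are illustrative, not taken from the paper.

```python
from collections import defaultdict

def leave_one_group_out(groups):
    """Yield (train_idx, test_idx) pairs where each test fold is one
    whole cluster, so test observations come from clusters unseen in
    training, mimicking application of the model to new clusters."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in sorted(by_group.items()):
        train_idx = [i for i in range(len(groups)) if groups[i] != g]
        yield train_idx, test_idx

# Toy clustered data: six observations drawn from three clusters.
groups = ["a", "a", "b", "b", "c", "c"]
folds = list(leave_one_group_out(groups))
# Each fold's test set is exactly one cluster; a simple random split
# would instead scatter members of the same cluster across folds and
# tend to bias the GE estimate optimistically.
```

A plain random K-fold split would let observations from the same cluster appear in both training and test folds, which is exactly the bias the paper warns about.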
Hierarchical confusion matrix for classification performance evaluation
Riehl, Kevin, Neunteufel, Michael, Hemberg, Martin
In this work we propose the novel concept of a hierarchical confusion matrix, which opens the door to applying popular confusion-matrix-based (flat) evaluation measures from binary classification while accounting for the peculiarities of hierarchical classification problems. We develop the concept into a generalized form and prove its applicability to all types of hierarchical classification problems, including directed acyclic graphs, multi-path labelling, and non-mandatory leaf-node prediction. Finally, we use measures based on the novel confusion matrix to evaluate models in a benchmark of three real-world hierarchical classification applications and compare the results to established evaluation measures. The results demonstrate the soundness of this approach and its usefulness for evaluating hierarchical classification problems. An implementation of the hierarchical confusion matrix is available on GitHub.
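For context on the "established evaluation measures" such a matrix is compared against, one common set-based formulation of hierarchical precision and recall augments each true and predicted label with its ancestors before comparing the two sets. The sketch below shows that standard approach, not necessarily the hierarchical confusion matrix proposed here; the toy taxonomy and helper names are illustrative.

```python
def ancestors(label, parent):
    """Return the label together with all of its ancestors
    under the given child -> parent map."""
    out = set()
    while label is not None:
        out.add(label)
        label = parent.get(label)
    return out

def hierarchical_prf(true, pred, parent):
    """Set-based hierarchical precision/recall/F1: augment true and
    predicted labels with their ancestors, then count overlaps."""
    tp = fp = fn = 0
    for t, p in zip(true, pred):
        T, P = ancestors(t, parent), ancestors(p, parent)
        tp += len(T & P)
        fp += len(P - T)
        fn += len(T - P)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)
    return prec, rec, f1

# Toy taxonomy: root -> {animal, vehicle}, animal -> {dog, cat}.
parent = {"animal": "root", "vehicle": "root",
          "dog": "animal", "cat": "animal"}
# Predicting "cat" for a true "dog" still gets partial credit,
# because both share the ancestors {animal, root}.
p, r, f = hierarchical_prf(["dog"], ["cat"], parent)
```

The partial credit for near-misses within the same subtree is what distinguishes these measures from flat accuracy, which would score the example simply as wrong.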
Perera
Data-intensive solutions, such as those that include machine learning components, are becoming increasingly prevalent. The standard way of developing such solutions is to train machine learning models on manually annotated (labeled) data for a given task. This methodology assumes that ample human-annotated data exist. Unfortunately, this is often not the case, owing to imbalanced class distributions and a lack of human annotation resources. The challenge is exacerbated when thousands of hierarchical classes are introduced.
Distribution-Calibrated Hierarchical Classification
While many advances have already been made in hierarchical classification learning, we take a step back and examine how a hierarchical classification problem should be formally defined. We pay particular attention to the fact that many arbitrary decisions go into the design of the label taxonomy that is given with the training data. Moreover, many hand-designed taxonomies are unbalanced and misrepresent the class structure in the underlying data distribution. We attempt to correct these problems by using the data distribution itself to calibrate the hierarchical classification loss function. This distribution-based correction must be done with care, to avoid introducing unmanageable statistical dependencies into the learning problem. This leads us off the beaten path of binomial-type estimation and into the unfamiliar waters of geometric-type estimation. In this paper, we present a new calibrated definition of statistical risk for hierarchical classification, an unbiased estimator for this risk, and a new algorithmic reduction from hierarchical classification to cost-sensitive classification.
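One way to picture distribution-based calibration, purely as an illustrative sketch and not the paper's risk definition or its geometric-type estimator, is to weight taxonomy nodes by the empirical mass of their subtrees, so that an unbalanced hand-designed hierarchy does not dominate the loss. All names and the information-style weighting below are assumptions of this sketch.

```python
import math
from collections import Counter

def calibrated_node_weights(labels, parent):
    """Weight each taxonomy node by -log of the empirical fraction of
    training labels falling in its subtree: rare, deeply nested classes
    get larger weights, regardless of how the taxonomy was drawn."""
    counts = Counter()
    for lbl in labels:
        node = lbl
        while node is not None:
            counts[node] += 1      # credit the label and every ancestor
            node = parent.get(node)
    total = len(labels)
    return {n: -math.log(c / total) for n, c in counts.items()}

# Toy data: "dog" is common, "cat" is rare, "animal" is the root.
labels = ["dog", "dog", "dog", "cat"]
parent = {"dog": "animal", "cat": "animal", "animal": None}
w = calibrated_node_weights(labels, parent)
# The root covers every example and so contributes weight zero, while
# the rare "cat" node is weighted more heavily than the common "dog".
```

These weights could then scale per-node terms of a hierarchical loss; the paper's actual contribution is a calibrated risk with an unbiased estimator, which this frequency-weighting sketch only gestures at.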